Improved surname pronunciations using decision trees
Abstract
Proper noun pronunciation generation is a particularly challenging problem in speech recognition since a large percentage of proper nouns often defy typical letter-to-sound conversion rules. In this paper, we present decision tree methods which outperform neural network techniques. Using the decision tree method, we have achieved an overall error rate of 45.5%, which is a 35% reduction over the previous techniques. Our best system is a binary decision tree that uses a context length of 3 and employs information gain ratio as the splitting rule.
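As a rough illustration of the splitting rule named in the abstract, the sketch below scores one yes/no question about a letter's context by its information gain ratio (information gain divided by the entropy of the split itself). It is not the authors' implementation: the helper names `context_window`, `entropy`, and `gain_ratio`, the `#` padding symbol, the toy surnames, and the phoneme labels are all hypothetical, and "context length of 3" is assumed here to mean three letters on each side of the target letter.

```python
import math
from collections import Counter


def context_window(word, i, k=3):
    """Return the letter at word[i] with k letters of context on each side.

    Assumption: positions past the word boundary are padded with '#'.
    The target letter always sits at index k of the returned string.
    """
    padded = "#" * k + word + "#" * k
    return padded[i:i + 2 * k + 1]


def entropy(labels):
    """Shannon entropy, in bits, of a non-empty list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(samples, question):
    """Information gain ratio of a binary question.

    samples  -- list of (window, phoneme) pairs
    question -- function mapping a window to True or False
    """
    yes = [ph for window, ph in samples if question(window)]
    no = [ph for window, ph in samples if not question(window)]
    if not yes or not no:
        return 0.0  # a split that sends everything one way is uninformative
    total = len(samples)
    p_yes, p_no = len(yes) / total, len(no) / total
    parent = entropy([ph for _, ph in samples])
    gain = parent - p_yes * entropy(yes) - p_no * entropy(no)
    split_info = -(p_yes * math.log2(p_yes) + p_no * math.log2(p_no))
    return gain / split_info


# Toy data (hypothetical): how is an initial "c" pronounced?
toy = [
    (context_window("cole", 0), "k"),
    (context_window("carr", 0), "k"),
    (context_window("cecil", 0), "s"),
    (context_window("cyr", 0), "s"),
]


def next_is_front_vowel(window):
    """Question: is the letter right after the target one of e, i, y?"""
    return window[4] in "eiy"


score = gain_ratio(toy, next_is_front_vowel)
```

A tree grower in this style would evaluate many candidate questions at each node and keep the one with the highest ratio; dividing by the split entropy penalizes questions that peel off only a tiny fraction of the samples.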
Similar resources
Effects of Speaking Rate and Word Frequency on Conversational Pronunciations
The possible set of pronunciations in continuous speech corpora change dynamically with many factors. Two variables, speaking rate and word predictability, seemed to be promising candidates for integration into dynamic ASR pronunciation models; however, our initial efforts to incorporate these factors into phone-level decision tree models met with limited success. In this paper, we confirm the i...
Multi-level decision trees for static and dynamic pronunciation models
We have been focusing on improving pronunciation models for automatic transcription of television and radio news reports by modeling phone, syllable, and word pronunciation distributions with decision trees. These models were employed in two separate sets of experiments. First, decision trees facilitated selection of word pronunciations derived automatically from data for use in a standard spee...
Rescoring multiple pronunciations generated from spelled words
Building on earlier work [2], we show how a set of binary decision trees grown by means of the Gelfand-Ravishankar-Delp algorithm [8] can be trained to generate an ordered list of possible pronunciations from a spelled word. Training is carried out on a database consisting of spelled words paired with their pronunciations (in a particular language). We show how phonotactic information can be le...
Modeling pronunciation variation with context-dependent articulatory feature decision trees
We consider the problem of predicting the surface pronunciations of a word in conversational speech, using a model of pronunciation variation based on articulatory features. We build context-dependent decision trees for both phone-based and feature-based models, and compare their perplexities on conversational data from the Switchboard Transcription Project. We find that a fully-factored model,...
Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environment
The most common target of multilingual ASR aims at covering various speakers from various languages. The problem we address in this article is more specifically an asymmetrical bilingual scenario, where the same speaker may insert in his speech some foreign words using foreign pronunciations. This is a frequent situation for French as spoken in Canada, where English proper names are often spoke...